Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(Gong) - Add transcripts sync #11231

Merged
merged 46 commits into from
Mar 5, 2025
Merged

feat(Gong) - Add transcripts sync #11231

merged 46 commits into from
Mar 5, 2025

Conversation

aubin-tchoi
Copy link
Contributor

@aubin-tchoi aubin-tchoi commented Mar 5, 2025

Description

  • This PR implements the sync of Gong transcripts.

The transcript are all available under a single folder named "Transcripts", which is not selectable.
A table gong_transcripts is added by this PR.

The transcripts are synced as follows:

  • A page of transcripts are fetched using the /calls/transcript endpoint.
  • The corresponding metadata is fetched using the /calls/extensive endpoint.
  • The participants are fetched from connectors' table gong_users, fetching Gong's API if not in db.
  • The data_source_document is created and upserted.

The following tags are attached to the document:

  • title.
  • createdAt.
  • language (ISO-639-2B: eng, fre, spa, ger, and ita).
  • media (audio or video).
  • scope (Internal, External).
  • direction (Inbound, Outbound, Conference).
  • the participant's emails if we have them.

Tests

  • Tested locally.

Risk

  • N/A.

Deploy Plan

  • Apply migration_60.sql.
  • Deploy connectors.
  • Deploy front.

Copy link

github-actions bot commented Mar 5, 2025

Warnings
⚠️

Files in **/lib/models/ have been modified and the PR has the migration-ack label. Don't forget to run the migration from prodbox.

Generated by 🚫 dangerJS against c2fcece

@aubin-tchoi aubin-tchoi added the migration-ack 📁 Label to acknowledge that a migration is required. label Mar 5, 2025
@aubin-tchoi aubin-tchoi force-pushed the gong-transcripts branch 2 times, most recently from 3ae9bdc to f31e5ba Compare March 5, 2025 15:57
# Conflicts:
#	connectors/src/connectors/gong/lib/gong_api.ts
#	connectors/src/connectors/gong/temporal/activities.ts
@aubin-tchoi aubin-tchoi requested a review from flvndvd March 5, 2025 17:30
Copy link
Contributor

@flvndvd flvndvd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once comments have been addressed 🔥

"createdAt" TIMESTAMP WITH TIME ZONE NOT NULL,
"updatedAt" TIMESTAMP WITH TIME ZONE NOT NULL,
"callId" TEXT NOT NULL,
"title" TEXT NOT NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can title be bigger than 255 characters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have occurrences in our gong but I would assume it's not really bounded

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's truncate in the resource then.

Comment on lines 13 to 14
CREATE INDEX "gong_transcripts_connector_id" ON "gong_transcripts" ("connectorId");
CREATE UNIQUE INDEX "gong_transcripts_connector_id_call_id" ON "gong_transcripts" ("connectorId", "callId");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the first index, the second one (which is a composite index) will also be used when querying only on connectorId.

folderId: makeGongTranscriptFolderInternalId(connector),
parents: [makeGongTranscriptFolderInternalId(connector)],
parentId: null,
title: "Transcripts",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit
No magic variable.

export const GongParticipantCodec = t.intersection([
t.type({
speakerId: t.union([t.string, t.null]),
userId: t.union([t.string, t.null, t.undefined]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure it can both null or undefined? Feels like a very strange API 😅.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mb it can only be undefined, forgot to clean this up

started: t.string,
duration: t.number,
title: t.string,
media: t.union([t.literal("Video"), t.literal("Audio")]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest removing some fields from here and only keeping the one we use to avoid validation error.

await upsertDataSourceDocument({
dataSourceConfig,
documentId,
documentContent: await renderDocumentTitleAndContent({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit No await in object, let's extract them.

`language:${transcriptMetadata.metaData.language}`, // The language codes (as defined by ISO-639-2B): eng, fre, spa, ger, and ita.
`media:${transcriptMetadata.metaData.media}`,
`scope:${transcriptMetadata.metaData.scope}`,
...participants.map((p) => p.email),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this lives out the participant that are not internal to the connected Gong's workspace. Can we still include them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm in any case these participants are retrieved using getGongUsers, meaning they are exposed by Gong's API so I would say it's fine since they are not local to the meeting and exist somewhere in gong

Comment on lines 125 to 126
{ fields: ["connectorId"] },
{ fields: ["connectorId", "callId"], unique: true },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, one composite index serves the two needs.

Suggested change
{ fields: ["connectorId"] },
{ fields: ["connectorId", "callId"], unique: true },
{ fields: ["connectorId", "callId"], unique: true },

@@ -32,6 +36,14 @@ export class GongConnectorStrategy
connector: ConnectorResource,
transaction: Transaction
): Promise<void> {
await GongUserModel.destroy({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏


toJSON(): Record<string, unknown> {
return {
id: this.id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Let's add the Gong call id, no?

@aubin-tchoi aubin-tchoi merged commit 1eaaff9 into main Mar 5, 2025
6 checks passed
@aubin-tchoi aubin-tchoi deleted the gong-transcripts branch March 5, 2025 22:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
migration-ack 📁 Label to acknowledge that a migration is required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants